confidence interval
Revisiting Active Sequential Prediction-Powered Mean Estimation
Sfyraki, Maria-Eleni, Wang, Jun-Kun
In this work, we revisit the problem of active sequential prediction-powered mean estimation, where at each round one must decide the query probability of the ground-truth label upon observing the covariates of a sample. Furthermore, if the label is not queried, the prediction from a machine learning model is used instead. Prior work proposed an elegant scheme that determines the query probability by combining an uncertainty-based suggestion with a constant probability that encodes a soft constraint on the query probability. We explored different values of the mixing parameter and observed an intriguing empirical pattern: the smallest confidence width tends to occur when the weight on the constant probability is close to one, thereby reducing the influence of the uncertainty-based component. Motivated by this observation, we develop a non-asymptotic analysis of the estimator and establish a data-dependent bound on its confidence interval. Our analysis further suggests that when a no-regret learning approach is used to determine the query probability and control this bound, the query probability converges to the constraint of the max value of the query probability when it is chosen obliviously to the current covariates. We also conduct simulations that corroborate these theoretical findings.
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Oceania > New Zealand (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Asia > Middle East > Jordan (0.04)
Covariance-Based Structural Equation Modeling in Small-Sample Settings with $p>n$
Hasegawa, Hiroki, Tamura, Aoba, Okada, Yukihiko
Factor-based Structural Equation Modeling (SEM) relies on likelihood-based estimation assuming a nonsingular sample covariance matrix, which breaks down in small-sample settings with $p>n$. To address this, we propose a novel estimation principle that reformulates the covariance structure into self-covariance and cross-covariance components. The resulting framework defines a likelihood-based feasible set combined with a relative error constraint, enabling stable estimation in small-sample settings where $p>n$ for sign and direction. Experiments on synthetic and real-world data show improved stability, particularly in recovering the sign and direction of structural parameters. These results extend covariance-based SEM to small-sample settings and provide practically useful directional information for decision-making.
- Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.05)
- South America > Colombia (0.04)
Scalable Model-Based Clustering with Sequential Monte Carlo
Trojan, Connie, Myshkov, Pavel, Fearnhead, Paul, Hensman, James, Minka, Tom, Nemeth, Christopher
In online clustering problems, there is often a large amount of uncertainty over possible cluster assignments that cannot be resolved until more data are observed. This difficulty is compounded when clusters follow complex distributions, as is the case with text data. Sequential Monte Carlo (SMC) methods give a natural way of representing and updating this uncertainty over time, but have prohibitive memory requirements for large-scale problems. We propose a novel SMC algorithm that decomposes clustering problems into approximately independent subproblems, allowing a more compact representation of the algorithm state. Our approach is motivated by the knowledge base construction problem, and we show that our method is able to accurately and efficiently solve clustering problems in this setting and others where traditional SMC struggles.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Covariance-adapting algorithm for semi-bandits with application to sparse rewards
Perrault, Pierre, Perchet, Vianney, Valko, Michal
We investigate stochastic combinatorial semi-bandits, where the entire joint distribution of outcomes impacts the complexity of the problem instance (unlike in the standard bandits). Typical distributions considered depend on specific parameter values, whose prior knowledge is required in theory but quite difficult to estimate in practice; an example is the commonly assumed sub-Gaussian family. We alleviate this issue by instead considering a new general family of sub-exponential distributions, which contains bounded and Gaussian ones. We prove a new lower bound on the expected regret on this family, that is parameterized by the unknown covariance matrix of outcomes, a tighter quantity than the sub-Gaussian matrix. We then construct an algorithm that uses covariance estimates, and provide a tight asymptotic analysis of the regret. Finally, we apply and extend our results to the family of sparse outcomes, which has applications in many recommender systems.
- Europe > Spain > Canary Islands (0.04)
- North America > United States > California (0.04)
- Europe > Iceland > Capital Region > Reykjavik (0.04)
- Europe > France > Hauts-de-France > Pas-de-Calais (0.04)
MCAnalysis: An Open-Source Package for Preprocessing, Modelling, and Visualisation of Menstrual Cycle Effects in Digital Health Data
Delray, Kyra, Lewis, Glyn, Grace, Bola, Hayes, Joseph, Evans, Robin
Digital Health Technologies (DHTs) including consumer wearable devices and digital health applications offer an opportunity for continuous, large-scale data collection. Wearables give insight into physiological biomarkers that help us understand the human body, through passive data collection. Such data can be collected at a regularity that would be impossible otherwise. Digital health applications provide the chance to collect diverse types of data from clinically validated surveys, GPS, and contextual inputs. This combination has the ability to make profound advances in our understanding of the factors that affect individuals on a personal and population level [Grace et al., 2025]. One of these factors is the menstrual cycle. Particularly because of its inter-individual variability, studying it requires large sample sizes, and to truly grasp its effects on the human body, it needs to be observed on a near-daily scale [Bull et al., 2019].
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.88)
- Health & Medicine > Therapeutic Area > Neurology (0.69)
- Health & Medicine > Consumer Health (0.69)
MEC: Machine-Learning-Assisted Generalized Entropy Calibration for Semi-Supervised Mean Estimation
Obtaining high-quality labels is costly, whereas unlabeled covariates are often abundant, motivating semi-supervised inference methods with reliable uncertainty quantification. Prediction-powered inference (PPI) leverages a machine-learning predictor trained on a small labeled sample to improve efficiency, but it can lose efficiency under model misspecification and suffer from coverage distortions due to label reuse. We introduce Machine-Learning-Assisted Generalized Entropy Calibration (MEC), a cross-fitted, calibration-weighted variant of PPI. MEC improves efficiency by reweighting labeled samples to better align with the target population, using a principled calibration framework based on Bregman projections. This yields robustness to affine transformations of the predictor and relaxes requirements for validity by replacing conditions on raw prediction error with weaker projection-error conditions. As a result, MEC attains the semiparametric efficiency bound under weaker assumptions than existing PPI variants. Across simulations and a real-data application, MEC achieves near-nominal coverage and tighter confidence intervals than CF-PPI and vanilla PPI.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > Russia (0.04)
- (2 more...)
Nonparametric Regression Discontinuity Designs with Survival Outcomes
Schuessler, Maximilian, Sverdrup, Erik, Tibshirani, Robert, Wager, Stefan
Quasi-experimental evaluations are central for generating real-world causal evidence and complementing insights from randomized trials. The regression discontinuity design (RDD) is a quasi-experimental design that can be used to estimate the causal effect of treatments that are assigned based on a running variable crossing a threshold. Such threshold-based rules are ubiquitous in healthcare, where predictive and prognostic biomarkers frequently guide treatment decisions. However, standard RD estimators rely on complete outcome data, an assumption often violated in time-to-event analyses where censoring arises from loss to follow-up. To address this issue, we propose a nonparametric approach that leverages doubly robust censoring corrections and can be paired with existing RD estimators. Our approach can handle multiple survival endpoints, long follow-up times, and covariate-dependent variation in survival and censoring. We discuss the relevance of our approach across multiple areas of applications and demonstrate its usefulness through simulations and the prostate component of the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial where our new approach offers several advantages, including higher efficiency and robustness to misspecification. We have also developed an open-source software package, $\texttt{rdsurvival}$, for the $\texttt{R}$ language.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Syria > Aleppo Governorate > Aleppo (0.04)
- Research Report > Strength High (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Debiased Estimators in High-Dimensional Regression: A Review and Replication of Javanmard and Montanari (2014)
High-dimensional statistical settings ($p \gg n$) pose fundamental challenges for classical inference, largely due to bias introduced by regularized estimators such as the LASSO. To address this, Javanmard and Montanari (2014) propose a debiased estimator that enables valid hypothesis testing and confidence interval construction. This report examines their debiased LASSO framework, which yields asymptotically normal estimators in high-dimensional settings. The key theoretical results underlying this approach are presented. Specifically, the construction of an optimized debiased estimator that restores asymptotic normality, which enables the computation of valid confidence intervals and $p$-values. To evaluate the claims of Javanmard and Montanari, a subset of the original simulation study and the real-data analysis is presented. The original empirical analysis is extended to the desparsified LASSO, which is referenced but not implemented in the original study. The results demonstrate that while the debiased LASSO achieves reliable coverage and controls Type I error, the LASSO projection estimator can offer improved power in idealized low-signal settings without compromising error rates. The results reveal a trade-off: the LASSO projection estimator performs well in low-signal settings, while Javanmard and Montanari's method is more robust to complex correlations, improving precision and signal detection in real data.
- Europe > Austria > Vienna (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > Ontario > Toronto (0.04)
Concept frustration: Aligning human concepts and machine representations
Parisini, Enrico, Soelistyo, Christopher J., Isaac, Ahab, Barp, Alessandro, Banerji, Christopher R. S.
Aligning human-interpretable concepts with the internal representations learned by modern machine learning systems remains a central challenge for interpretable AI. We introduce a geometric framework for comparing supervised human concepts with unsupervised intermediate representations extracted from foundation model embeddings. Motivated by the role of conceptual leaps in scientific discovery, we formalise the notion of concept frustration: a contradiction that arises when an unobserved concept induces relationships between known concepts that cannot be made consistent within an existing ontology. We develop task-aligned similarity measures that detect concept frustration between supervised concept-based models and unsupervised representations derived from foundation models, and show that the phenomenon is detectable in task-aligned geometry while conventional Euclidean comparisons fail. Under a linear-Gaussian generative model we derive a closed-form expression for Bayes-optimal concept-based classifier accuracy, decomposing predictive signal into known-known, known-unknown and unknown-unknown contributions and identifying analytically where frustration affects performance. Experiments on synthetic data and real language and vision tasks demonstrate that frustration can be detected in foundation model representations and that incorporating a frustrating concept into an interpretable model reorganises the geometry of learned concept representations, to better align human and machine reasoning. These results suggest a principled framework for diagnosing incomplete concept ontologies and aligning human and machine conceptual reasoning, with implications for the development and validation of safe interpretable AI for high-risk applications.
- Europe > United Kingdom (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Information Technology > Security & Privacy (0.67)
- Law (0.46)
Online Statistical Inference of Constant Sample-averaged Q-Learning
Panda, Saunak Kumar, Li, Tong, Liu, Ruiqi, Xiang, Yisha
Reinforcement learning algorithms have been widely used for decision-making tasks in various domains. However, the performance of these algorithms can be impacted by high variance and instability, particularly in environments with noise or sparse rewards. In this paper, we propose a framework to perform statistical online inference for a sample-averaged Q-learning approach. We adapt the functional central limit theorem (FCLT) for the modified algorithm under some general conditions and then construct confidence intervals for the Q-values via random scaling. We conduct experiments to perform inference on both the modified approach and its traditional counterpart, Q-learning using random scaling and report their coverage rates and confidence interval widths on two problems: a grid world problem as a simple toy example and a dynamic resource-matching problem as a real-world example for comparison between the two solution approaches.